Goto

Collaborating Authors

 spatial map





Supplementary Material: Simultaneous embedding of multiple attractor manifolds in a recurrent neural network using constrained gradient optimization

Neural Information Processing Systems

The dynamics of neural activity are described by a standard rate model. Note that only the third term of Eq. 'th place cell preferred firing position in the's are standard unit vectors spanning an orthonormal basis. To derive Eq. 3 we evaluate the derivative of Energy landscapes were uniformly shifted throughout the manuscript by a constant (Figs. For each network with a different number of total embedded maps, 15 realizations were performed in which the permutations between the spatial maps were chosen independently and at random. Code availability Code is available at public repository https://doi.org/10.5281/zenodo.10016179.



existence of multiple representations of the same environment for a few sample neurons, we performed hypothesis tests for multiple

Neural Information Processing Systems

We thank all reviewers for their careful reviews and many positive comments. We feel that the typos and minor issues are easily addressable and will be corrected. We will incorporate this analysis into a revision of the paper. We thank R1 for bringing this highly related work to our attention. That work focuses on environments for which mice have previously developed spatial maps.


CoPatch: Zero-Shot Referring Image Segmentation by Leveraging Untapped Spatial Knowledge in CLIP

An, Na Min, Kang, Inha, Lee, Minhyun, Shim, Hyunjung

arXiv.org Artificial Intelligence

Spatial grounding is crucial for referring image segmentation (RIS), where the goal of the task is to localize an object described by language. Current foundational vision-language models (VLMs), such as CLIP, excel at aligning images and text but struggle with understanding spatial relationships. Within the language stream, most existing methods often focus on the primary noun phrase when extracting local text features, undermining contextual tokens. Within the vision stream, CLIP generates similar features for images with different spatial layouts, resulting in limited sensitivity to spatial structure. To address these limitations, we propose \textsc{CoPatch}, a zero-shot RIS framework that leverages internal model components to enhance spatial representations in both text and image modalities. For language, \textsc{CoPatch} constructs hybrid text features by incorporating context tokens carrying spatial cues. For vision, it extracts patch-level image features using our novel path discovered from intermediate layers, where spatial structure is better preserved. These enhanced features are fused into a clustered image-text similarity map, \texttt{CoMap}, enabling precise mask selection. As a result, \textsc{CoPatch} significantly improves spatial grounding in zero-shot RIS across RefCOCO, RefCOCO+, RefCOCOg, and PhraseCut (+ 2--7 mIoU) without requiring any additional training. Our findings underscore the importance of recovering and leveraging the untapped spatial knowledge inherently embedded in VLMs, thereby paving the way for opportunities in zero-shot RIS.


Semantic Mapping in Indoor Embodied AI -- A Comprehensive Survey and Future Directions

Raychaudhuri, Sonia, Chang, Angel X.

arXiv.org Artificial Intelligence

Among many skills that the agents need to possess, building and maintaining a semantic map of the environment is most crucial in long-horizon tasks. A semantic map captures information about the environment in a structured way, allowing the agent to reference it for advanced reasoning throughout the task. While existing surveys in embodied AI focus on general advancements or specific tasks like navigation and manipulation, this paper provides a comprehensive review of semantic map-building approaches in embodied AI, specifically for indoor navigation. We categorize these approaches based on their structural representation (spatial grids, topological graphs, dense point-clouds or hybrid maps) and the type of information they encode (implicit features or explicit environmental data). We also explore the strengths and limitations of the map building techniques, highlight current challenges, and propose future research directions. We identify that the field is moving towards developing open-vocabulary, queryable, task-agnostic map representations, while high memory demands and computational inefficiency still remaining to be open challenges. This survey aims to guide current and future researchers in advancing semantic mapping techniques for embodied AI systems.


Reviews: Multivariate Convolutional Sparse Coding for Electromagnetic Brain Signals

Neural Information Processing Systems

This work extends convolutional sparse coding to the multivariate case with a focus on multichannel EEG decomposition. This corresponds to a non-convex minimization problem and a local minimum is found via an alternating optimization. Reasonable efficient bookkeeping (precomputation of certain factors, and windowing for locally greedy coordinate descent) is used to improve scalability. The locally greedy coordinate descent cycles time windows, but computes a greedy coordinate descent within each window. As spatial patterns are essential for understanding EEG, this multivariate extension is an important contribution.


Multiscale Neuroimaging Features for the Identification of Medication Class and Non-Responders in Mood Disorder Treatment

Baker, Bradley T., Salman, Mustafa S., Fu, Zening, Iraji, Armin, Osuch, Elizabeth, Bockholt, Jeremy, Calhoun, Vince D.

arXiv.org Artificial Intelligence

In the clinical treatment of mood disorders, the complex behavioral symptoms presented by patients and variability of patient response to particular medication classes can create difficulties in providing fast and reliable treatment when standard diagnostic and prescription methods are used. Increasingly, the incorporation of physiological information such as neuroimaging scans and derivatives into the clinical process promises to alleviate some of the uncertainty surrounding this process. Particularly, if neural features can help to identify patients who may not respond to standard courses of anti-depressants or mood stabilizers, clinicians may elect to avoid lengthy and side-effect-laden treatments and seek out a different, more effective course that might otherwise not have been under consideration. Previously, approaches for the derivation of relevant neuroimaging features work at only one scale in the data, potentially limiting the depth of information available for clinical decision support. In this work, we show that the utilization of multi spatial scale neuroimaging features - particularly resting state functional networks and functional network connectivity measures - provide a rich and robust basis for the identification of relevant medication class and non-responders in the treatment of mood disorders. We demonstrate that the generated features, along with a novel approach for fast and automated feature selection, can support high accuracy rates in the identification of medication class and non-responders as well as the identification of novel, multi-scale biomarkers.